Apparatus and method for detecting adaptive motion direction

ABSTRACT

In an adaptive motion direction detecting apparatus, an image input unit inputs two-dimensional pixel data of an object. A response output unit includes a plurality of response element arrays each having different time phases, and each of the response element arrays includes a plurality of response elements. Each of the response elements generates a response output for one of a plurality of local areas partly superposed thereon, and the two-dimensional pixel data is divided into the local areas. A correlation function calculating unit calculates spatial and time correlation functions between the response outputs of the response elements. A response output selecting unit selects response outputs of the response output unit for each of the local areas in accordance with the spatial and time correlation functions. A motion direction detecting unit includes a plurality of detection elements each corresponding to the response elements. Each of the detection elements detects a motion direction of the object at one of the local areas in accordance with selected response outputs therefor.

CROSS-REFERENCE TO RELATED APPLICATIONS

This is a continuation of U.S. patent application Ser. No. 10/027,007, filed Dec. 26, 2001, in the name of Masanobu MIYASHITA.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an apparatus and method for detecting an adaptive motion direction of an image (object).

2. Description of the Related Art

In a first prior art adaptive motion direction detecting method, the following steps are carried out:

1) first two-dimensional pixel data of an object from a pickup camera for preset areas at time t1 is fetched and stored in a first frame memory;

2) second two-dimensional pixel data of the object from the pickup camera for the preset areas at time t2 (>t1) is fetched and stored in a second frame memory;

3) differences between the first and second two-dimensional pixel data stored in the first and second frame memories are calculated; and

4) an image of the object at time t3 (>t2) is estimated to detect a motion direction of the contour of the object.

The above-described first prior art adaptive motion direction detecting method is disclosed in JP-A-6-96210 and JP-8-44852.

In the first prior art adaptive motion direction detecting method, however, since the image of the object has a bright contour and a dark contour which are generally different from each other, the calculated differences between the first and second two-dimensional pixel data depend upon the deviation between the bright contour and the dark contour, so that it is difficult to accurately determine the motion direction of the object.

In the above-described first prior art adaptive motion direction detecting method, in order to accurately determine the motion direction of an object independent of a bright contour and a dark contour thereof, two-dimensional pixel data in each of the preset areas are normalized by their maximum lightness and their minimum lightness to form bright contour data and dark contour data. Then, motion directions of the object are determined in accordance with the bright contour data and the dark contour data, respectively. On the other hand, two-dimensional pixel data in each of the preset areas is caused to be binary data “1” when the lightness thereof is larger than a definite lightness, and two-dimensional pixel data in each of the preset areas is caused to be binary data “0” when the lightness thereof is not larger than the definite lightness. However, these techniques deteriorate the efficiency of motion detection. In addition, the presence of the two frame memories does not save the resource of memories.

In a second prior art adaptive motion direction detecting method, spatial and time convoluting integrations are performed upon N successive images for each local area at times t, t+Δt, t+2Δt, . . . , t+NΔt, to obtain an optical flow, thus detecting a speed of an object (see: JP-A-7-192135).

In the above-described second prior art adaptive motion direction detecting method, however, since means for determining all gradients of the object are required, the resource thereof cannot be saved. That is, images statistically include horizontal displacement; even in this case, spatial and time convoluting integrations for all the directions have to be carried out, which decreases the efficiency of the resource thereof.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide an apparatus and method for detecting an adaptive motion direction capable of enhancing the resource efficiency and saving the resource of memories.

According to the present invention, in an adaptive motion direction detecting apparatus, an image input unit inputs two-dimensional pixel data of an object. A response output unit includes a plurality of response element arrays each having different time phases, and each of the response element arrays includes a plurality of response elements. Each of the response elements generates a response output for one of a plurality of local areas partly superposed thereon. In this case, the two-dimensional pixel data is divided into the local areas. A correlation function calculating unit calculates spatial and time correlation functions between the response outputs of the response elements. A response output selecting unit selects response outputs of the response output unit for each of the local areas in accordance with the spatial and time correlation functions. A motion direction detecting unit includes a plurality of detection elements each corresponding to the response elements. Each of the detection elements detects a motion direction of the object at one of the local areas in accordance with selected response outputs therefor.

Also, in an adaptive motion direction detecting method, two-dimensional pixel data of an object is inputted. Then, a response output is generated for one of a plurality of local areas partly superposed thereon. In this case, the two-dimensional pixel data is divided into the local areas. Then, spatial and time correlation functions between the response outputs are calculated. Then, response outputs are selected for each of the local areas in accordance with the spatial and time correlation functions. Finally, a motion direction of the object at one of the local areas is detected in accordance with selected response outputs for the one of the local areas.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be more clearly understood from the description set forth below, as compared with the prior art, with reference to the accompanying drawings, wherein:

FIG. 1 is a block circuit diagram illustrating an embodiment of the adaptive motion direction detecting apparatus according to the present invention;

FIG. 2 is a diagram of pixel data inputted by the image input unit 1 of FIG. 1;

FIG. 3 is a diagram of the response output unit of FIG. 1;

FIG. 4 is a diagram of the response selecting unit and the motion direction detecting unit of FIG. 1;

FIG. 5A is a diagram illustrating an example of the output of one detection element of FIG. 4;

FIG. 5B is a diagram of an arrow showing the gradient of the output of the detection element of FIG. 5A;

FIGS. 6 and 7 are diagrams of arrows showing examples of the gradients of the outputs of all the detection elements of FIG. 4; and

FIG. 8 is a diagram of a memory into which the tables of FIG. 4 are combined.

DESCRIPTION OF THE PREFERRED EMBODIMENT

In FIG. 1, which illustrates an embodiment of the adaptive motion direction detecting apparatus according to the present invention, an image input unit 1, a response output unit 2, a correlation function calculating unit 3, a response selecting unit 4, a motion direction detecting unit 5, and an output unit 6 are provided. In this case, the response output unit 2, the correlation function calculating unit 3 and the response selecting unit 4 are constructed by a computer including a central processing unit (CPU), memories and the like.

The image input unit 1 is constructed by a charge coupled device (CCD) camera, for example, which inputs two-dimensional pixel data 101 such as 1000×1000 pixel data having coordinate k (k=1 to 1000000) as shown in FIG. 2. In this case, pixel data of coordinate k at time t is represented by f_(k)(t).

The response output unit 2 is constructed by two-dimensional response element arrays 201, 202, 203 and 204 each having 100×100 response elements as shown in FIG. 3. In this case, each of the response elements has coordinate m (m=1 to 10000). Also, each of the response element arrays 201, 202, 203 and 204 has coordinate n (n=1 to 4).

That is, the two-dimensional data area of the pixel data is divided into local areas A₁, A₂, . . . , A_(m), . . . , A₁₀₀₀₀, which are partly superposed on each other. Then, each of the response elements m (m=1, 2, . . . , 10000) receives pixel data f_(k)(t) of the local area A_(m). Note that the local areas A₁, A₂, . . . , A_(m), . . . , A₁₀₀₀₀ are realized by performing a Gaussian function upon the pixel data f_(k)(t).

The coordinate n corresponds to a response phase. In this case, the difference in phase between adjacent values of the coordinate n is 90°.

A response function R_(m, k, n)(t) of the response element defined by coordinate m and coordinate n is represented by: $\begin{matrix} {R_{m,k,n}(t) = S_{m,k} \cdot T_{n}(t)} & (1) \end{matrix}$

where

S_(m, k)(t) is a spatial response function; and

T_(n)(t) is a time response function.

Also, the spatial response function S_(m, k)(t) is represented by: $\begin{matrix} {S_{m,k} = {{{1/\left( {2{\pi\lambda}_{ex}^{2}} \right)} \cdot {\exp\left( {{- d_{m,k}^{2}}/\left( {2\lambda_{ex}^{2}} \right)} \right)}} - {{\kappa/\left( {2{\pi\lambda}_{inh}^{2}} \right)} \cdot {\exp\left( {{- d_{m,k}^{2}}/\left( {2\lambda_{inh}^{2}} \right)} \right)}}}} & (2) \end{matrix}$

where

λ_(ex) and λ_(inh) are coefficients for showing spatial spread of the pixel data for one response element (λ_(ex)<λ_(inh), for example, λ_(ex):λ_(inh)=0.25:1); and

d_(m, k) is a distance between the coordinate m and the coordinate k.

Note that the coordinate m is defined by a center of the local area A_(m) of the coordinate k (see: dots of the pixel data 101 of FIG. 3). Therefore, the distance d_(m, k) is represented by: $\begin{matrix} {d_{m,k}^{2} = \left( x_{m} - x_{k} \right)^{2} + \left( y_{m} - y_{k} \right)^{2}} & (3) \end{matrix}$

where

x_(m) and y_(m) are an X-direction value and a Y-direction value, respectively, of the coordinate m; and

x_(k) and y_(k) are an X-direction value and a Y-direction value, respectively, of the coordinate k.
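By way of illustration only, the spatial response function of the formulae (2) and (3) may be sketched as follows. The function name `spatial_response` is hypothetical, and the default κ of 1.0 is an assumption, since no value of κ is given in this description; the defaults for λ_(ex) and λ_(inh) merely follow the example ratio 0.25:1.

```python
import math


def spatial_response(xm, ym, xk, yk, lam_ex=0.25, lam_inh=1.0, kappa=1.0):
    """Difference-of-Gaussians weight S_(m,k) following formulae (2) and (3).

    kappa=1.0 is an assumed placeholder; the description gives no value.
    """
    # Formula (3): squared distance between coordinate m and coordinate k.
    d2 = (xm - xk) ** 2 + (ym - yk) ** 2
    # Formula (2): narrow excitatory Gaussian minus wide inhibitory Gaussian.
    excit = math.exp(-d2 / (2 * lam_ex ** 2)) / (2 * math.pi * lam_ex ** 2)
    inhib = kappa * math.exp(-d2 / (2 * lam_inh ** 2)) / (2 * math.pi * lam_inh ** 2)
    return excit - inhib


# Weight at the center of a local area and at unit distance from it.
center = spatial_response(0.0, 0.0, 0.0, 0.0)
surround = spatial_response(0.0, 0.0, 1.0, 0.0)
```

With these assumed parameters the weight is positive at the center of the local area and negative at unit distance, which is the center-surround shape expected when λ_(ex)<λ_(inh).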

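The time response function T_(n)(t) is represented by:

$\begin{matrix} {T_{1}(t) = \exp\left( -t/\lambda_{t} \right) \cdot \sin\left( \omega t \right) \cdot \Theta\left\lbrack t\left( T_{0} - t \right) \right\rbrack} & (4) \end{matrix}$

$\begin{matrix} {T_{2}(t) = -\exp\left( -\left( T_{0} + \bar{T}_{0} - t \right)/\lambda_{t} \right) \cdot \sin\left\lbrack \omega\left( t - \bar{T}_{0} \right) \right\rbrack \cdot \Theta\left\lbrack \left( t - \bar{T}_{0} \right)\left( T_{0} + \bar{T}_{0} - t \right) \right\rbrack} & (5) \end{matrix}$

$\begin{matrix} {T_{3}(t) = -T_{1}(t)} & (6) \end{matrix}$

$\begin{matrix} {T_{4}(t) = -T_{2}(t)} & (7) \end{matrix}$

where

λ_(t) is a coefficient for showing a spread of time response;

T₀ is a period of the time response;

$\bar{T}_{0}$ is a delay time;

ω is an angular frequency (=2π/T₀); and

Θ is a step function.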
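For illustration, the four time response functions may be sketched as below, assuming that the delay time is a parameter distinct from the period T₀ and that the step function is 1 for a positive argument and 0 otherwise. The names `step`, `time_response`, `period`, `delay` and `lam_t` are hypothetical, and the default values are arbitrary placeholders rather than values taken from this description.

```python
import math


def step(x):
    """Step function; assumed here to be 1 for x > 0 and 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def time_response(n, t, period=1.0, delay=0.25, lam_t=0.5):
    """T_n(t) for n = 1 to 4, following one reading of formulae (4) to (7)."""
    omega = 2.0 * math.pi / period

    def t1(u):
        # Formula (4): decaying sinusoid, nonzero only for 0 < u < period.
        if step(u * (period - u)) == 0.0:
            return 0.0
        return math.exp(-u / lam_t) * math.sin(omega * u)

    def t2(u):
        # Formula (5): delayed sinusoid, nonzero only for delay < u < period + delay.
        if step((u - delay) * (period + delay - u)) == 0.0:
            return 0.0
        envelope = math.exp(-(period + delay - u) / lam_t)
        return -envelope * math.sin(omega * (u - delay))

    if n == 1:
        return t1(t)
    if n == 2:
        return t2(t)
    if n == 3:
        return -t1(t)  # formula (6)
    if n == 4:
        return -t2(t)  # formula (7)
    raise ValueError("n must be 1, 2, 3 or 4")
```

The window test is evaluated before the exponential so that times far outside the window return zero without evaluating a large exponent.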

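The response element at coordinate m and coordinate n generates the following response output η_(m,n)(t): $\begin{matrix} {\eta_{m,n}(t) = \int dt^{\prime} \sum\limits_{k} R_{m,k,n}\left( t^{\prime} \right) \cdot f_{k}\left( t - t^{\prime} \right)} & (8) \end{matrix}$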
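The formula (8) may be read as a convolution over past time combined with a weighted sum over the pixels of the local area. A minimal discrete sketch follows, assuming a unit time step, assuming that the summation runs over the pixel coordinate k, and assuming that frames outside the recorded sequence contribute nothing. The function name and the data layout are hypothetical.

```python
def response_output(R, f, t):
    """Discrete stand-in for formula (8) for one response element (fixed m, n).

    R[tp][k] is the response function sampled at time lag tp for pixel k.
    f[s][k] is the pixel value f_k at time step s.
    """
    total = 0.0
    for tp, weights in enumerate(R):
        s = t - tp  # time index t - t'
        if s < 0 or s >= len(f):
            continue  # assumption: frames outside the record contribute nothing
        total += sum(w * x for w, x in zip(weights, f[s]))
    return total


# Two time lags, two pixels.
R = [[1.0, 0.0], [0.5, 0.5]]
f = [[2.0, 4.0], [1.0, 3.0]]
eta_at_1 = response_output(R, f, 1)  # 1*1 + 0*3 + 0.5*2 + 0.5*4 = 4.0
```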

The correlation function calculating unit 3 calculates a correlation function Γ_(m, n; m′, n′) between the response output η_(m,n)(t) of a response element having coordinate m and coordinate n and the response output η_(m′,n′)(t) of a response element having coordinate m′ and coordinate n′ by: $\begin{matrix} \begin{matrix} {\Gamma_{m,{n;m^{\prime}},n^{\prime}} = {< {{\eta_{m,n}(t)} \cdot {\eta_{m^{\prime},n^{\prime}}(t)}} >_{t}}} \\ {= {\int\quad{{\mathbb{d}t}\quad{{\eta_{m,n}(t)} \cdot {\eta_{m^{\prime},n^{\prime}}(t)}}}}} \end{matrix} & (9) \end{matrix}$

where < >_(t) means an average value with respect to time.
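As a hedged illustration of the time average in the formula (9), the correlation between two sampled response outputs may be approximated by the mean of their products. The function name `correlation` and the sampling into equal-length lists are assumptions made for this sketch.

```python
import math


def correlation(eta_a, eta_b):
    """Time-averaged product of two sampled response outputs (cf. formula (9))."""
    if not eta_a or len(eta_a) != len(eta_b):
        raise ValueError("both outputs must be non-empty and equally long")
    return sum(a * b for a, b in zip(eta_a, eta_b)) / len(eta_a)


# Sinusoidal outputs sampled over one full period.
N = 100
in_phase = [math.sin(2 * math.pi * s / N) for s in range(N)]
quadrature = [math.cos(2 * math.pi * s / N) for s in range(N)]
opposite = [-v for v in in_phase]
```

In this toy case two outputs that are in phase give a positive correlation, outputs 90° apart give a correlation near zero, and outputs 180° apart give a negative correlation.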

The response selecting unit 4 selects response outputs having high spatial and time correlations from the response outputs η_(m,n)(t) of the response output unit 2 in accordance with the correlation functions Γ_(m, n; m′, n′) calculated by the correlation function calculating unit 3. Also, the motion direction detecting unit 5 detects a motion direction of an image in accordance with the output of the response selecting unit 4.

As illustrated in FIG. 4, the motion direction detecting unit 5 is constructed by a two-dimensional detection element array having 100×100 detection elements having coordinate i. Also, the response selecting unit 4 is constructed by tables 401, . . . , 40 i′, . . . for storing σ_(i, j, m, n) corresponding to the detection elements of coordinate i (i=1 to 10000).

The response selecting unit 4 is operated so that the following cost function E is minimum: $\begin{matrix} {E = \sum V_{i,i^{\prime}} E_{i^{\prime}}} & (10) \end{matrix}$ $\begin{matrix} {E_{i^{\prime}} = -\sum\sum \sigma_{i,j,m,n} \cdot \sigma_{i^{\prime},j^{\prime},m^{\prime},n^{\prime}} \Gamma_{m,n;m^{\prime},n^{\prime}}} & (11) \end{matrix}$ $\begin{matrix} {V_{i,i^{\prime}} = {{{1/\left( {2{\pi\lambda}_{1}^{2}} \right)} \cdot {\exp\left( {{- d_{i,i^{\prime}}^{2}}/\left( {2\lambda_{1}^{2}} \right)} \right)}} - {{1/\left( {2{\pi\lambda}_{2}^{2}} \right)} \cdot {\exp\left( {{- d_{i,i^{\prime}}^{2}}/\left( {2\lambda_{2}^{2}} \right)} \right)}}}} & (12) \end{matrix}$

where

λ₁ and λ₂ are coefficients (λ₁<λ₂, for example, λ₁:λ₂=0.25:1); and

d_(i, i′) is a distance between the coordinate i and the coordinate i′.

Note that the coordinates i and i′ are defined by a center of the local area of the coordinates i and i′, respectively. Therefore, the distance d_(i, i′) is represented by: $\begin{matrix} {d_{i,i^{\prime}}^{2} = \left( x_{i} - x_{i^{\prime}} \right)^{2} + \left( y_{i} - y_{i^{\prime}} \right)^{2}} & (13) \end{matrix}$

where

x_(i) and y_(i) are an X-direction value and a Y-direction value, respectively, of the coordinate i; and

x_(i′) and y_(i′) are an X-direction value and a Y-direction value, respectively, of the coordinate i′.
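One possible reading of the cost function of the formulae (10) to (13) is sketched below, purely as an illustration. It assumes that the cost is summed over all pairs of detection elements (i, i′), that the selection for each detection element is held as a list of (m, n) pairs whose selection value is “1” (the index j is omitted for brevity), and that correlations absent from the table are zero. All names, data layouts and default coefficient values are hypothetical.

```python
import math


def neighborhood(pi, pj, lam1=0.25, lam2=1.0):
    """V_(i,i') following formulae (12) and (13); lam1 and lam2 are placeholders."""
    d2 = (pi[0] - pj[0]) ** 2 + (pi[1] - pj[1]) ** 2
    g1 = math.exp(-d2 / (2 * lam1 ** 2)) / (2 * math.pi * lam1 ** 2)
    g2 = math.exp(-d2 / (2 * lam2 ** 2)) / (2 * math.pi * lam2 ** 2)
    return g1 - g2


def cost(positions, selected, gamma):
    """Cost E under the assumed pairwise reading of formulae (10) and (11).

    positions[i] -> (x, y) of detection element i
    selected[i]  -> list of (m, n) response outputs selected for element i
    gamma[((m, n), (m2, n2))] -> correlation between two response outputs
    """
    total = 0.0
    for i, pi in positions.items():
        for i2, pi2 in positions.items():
            # Formula (11): minus the summed correlation of the selected outputs.
            pair = -sum(
                gamma.get((a, b), 0.0)
                for a in selected[i]
                for b in selected[i2]
            )
            total += neighborhood(pi, pi2) * pair
    return total
```

Under this reading, selecting response outputs that are positively correlated at nearby detection elements lowers the cost, which is consistent with the selection of highly correlated response outputs described above.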

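Also, the output η_(i)(t) of the detection element of coordinate i is represented by: $\begin{matrix} {\eta_{i}(t) = \sum\limits_{k} R_{i,k}(t) \cdot f_{k}(t)} & (14) \end{matrix}$ $\begin{matrix} {R_{i,k}(t) = \sum\limits_{j,m,n} \sigma_{i,j,m,n} \cdot R_{m,k,n}(t)} & (15) \end{matrix}$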

Note that, only when the response output η_(m, n)(t) from the response output unit 2 is received by the response selecting unit 4, is σ_(i, j, m, n) “1”. Otherwise, σ_(i, j, m, n) is “0”.

Here, consider that f_(k)(t) is a grating pattern or a sinusoidal wave-shaped gray pattern given by: $\begin{matrix} {{f_{k}\left( {\left. t \middle| \theta \right.,f_{s},f_{t}} \right)} = {\cos\left\lbrack {{2\pi f_{s}\left( {{x_{k}\cos\theta} + {y_{k}\sin\theta}} \right)} - {2\pi f_{t}t}} \right\rbrack}} & (16) \end{matrix}$

where

θ is a gradient of the grating pattern corresponding to an angle of a motion direction of an object;

f_(s) is a space frequency of the grating pattern; and

f_(t) is a time frequency of the grating pattern.

In this case, the formula (14) is replaced by: $\begin{matrix} {{\eta_{i}\left( {\left. t \middle| \theta \right.,f_{s},f_{t}} \right)} = {\int\quad{{\mathbb{d}t^{\prime}}\Sigma\quad{R_{i,k}\left( t^{\prime} \right)}{f_{k}\left( {\left. {t^{\prime} - t} \middle| \theta \right.,f_{s},f_{t}} \right)}}}} & (17) \end{matrix}$

In the formula (17), the value of θ gives a direction selectivity of the detection element of coordinate i when η_(i)(t|θ, f_(s), f_(t)) is maximum. In other words, if θ does not show the direction selectivity of the detection element of coordinate i, η_(i)(t|θ, f_(s), f_(t)) is very small and is negligible. Thus, if pixel data f_(k)(t) is given, the motion direction of an image included in the pixel data f_(k)(t) can be detected by the output η_(i)(t|θ, f_(s), f_(t)) of the detection elements having coordinate i.
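For illustration, the grating pattern and the choice of the angle giving the maximum output may be sketched as follows. The sketch assumes that the time term of the formula (16) lies inside the cosine, so that the pattern drifts along the direction θ. The names `grating` and `preferred_direction` are hypothetical, and `toy_response` is an invented stand-in for the output of a detection element, not the output defined by the formula (17).

```python
import math


def grating(xk, yk, t, theta, fs, ft):
    """Drifting sinusoidal grating (cf. formula (16)), time term inside the cosine."""
    phase = (2 * math.pi * fs * (xk * math.cos(theta) + yk * math.sin(theta))
             - 2 * math.pi * ft * t)
    return math.cos(phase)


def preferred_direction(response, thetas):
    """Return the candidate angle for which the given response is maximum."""
    return max(thetas, key=response)


# Toy stand-in for a detection element tuned to 90 degrees (hypothetical).
def toy_response(theta):
    return math.cos(theta - math.pi / 2)


candidates = [k * math.pi / 4 for k in range(8)]
best = preferred_direction(toy_response, candidates)
```

With f_(s)=f_(t)=1 and θ=0, a crest located at x=0 at time 0 is found at x=0.25 at time 0.25, which is the drift along the direction θ that the pattern is intended to model.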

The operation of the adaptive motion direction detecting apparatus of FIG. 1 is explained simply below.

First, the image input unit 1 receives pixel data 101 as shown in FIG. 2. In this case, each of pixel data 101 is a gray signal.

The response output unit 2 divides the two-dimensional data area of the pixel data into a plurality of local areas A₁, A₂, . . . , A_(m), . . . as shown in FIG. 3. Then, the response output unit 2 generates the response output η_(m,n)(t) for each of the local areas A₁, A₂, . . . , A_(m), . . . .

The correlation function calculating unit 3 calculates spatial and time correlation functions Γ_(m, n; m′, n′) between the response outputs of the response output unit 2.

The response selecting unit 4 selects response outputs η_(m,n)(t) of the response output unit 2 in accordance with the spatial and time correlation functions Γ_(m, n; m′, n′).

The motion direction detecting unit 5 detects a motion direction in accordance with the selection result σ_(i, j, m, n) of the response selecting unit 4 and the response output η_(m,n)(t) of the response output unit 2.

In FIG. 5A, which shows an example of the output η_(i)(t) of one detection element of coordinate i of FIG. 4, depicted on the output unit 6, a solid line shows a bright contour of an object, while a dotted line shows a dark contour of the object. In this case, the bright contour and the dark contour are both moving along the same direction of a gradient θ as indicated by FIG. 5B.

FIGS. 6 and 7 are diagrams showing examples of arrows of the gradients of the outputs of all the detection elements of FIG. 4 where the number of detection elements is 50×50. Particularly, FIG. 7 shows a case where an object is moving around 0° and 180°.

In the above-described embodiment, when an object (image) is moving, the number of σ_(i, j, m, n) (=“1”) selected by the response selecting unit 4 is increased. On the other hand, when an object (image) is not moving, the number of σ_(i, j, m, n) (=“1”) selected by the response selecting unit 4 is decreased. Note that the tables 401, . . . , 40 i′, . . . of FIG. 4 are used for storing σ_(i, j, m, n) regardless of its value “1” or “0”; however, the tables 401, . . . , 40 i′, . . . can be combined into a memory as shown in FIG. 8 which stores only coordinates i, j, m and n of σ_(i, j, m, n) whose value is “1”. The memory of FIG. 8 can be reduced in size as compared with all of the tables 401, . . . , 40 i′, . . . of FIG. 4, which is helpful in saving the resource of memories.
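The combining of the tables into a memory that stores only the coordinates whose selection value is “1” may be illustrated as follows. The sketch assumes that the selection values are held in a mapping from (i, j, m, n) to “1” or “0”; the function names and this layout are hypothetical and are not the structure of FIG. 8 itself.

```python
def compress(selection):
    """Keep only the coordinates (i, j, m, n) whose selection value is 1."""
    return {key for key, value in selection.items() if value == 1}


def is_selected(memory, i, j, m, n):
    """Look up a selection value in the compressed memory."""
    return (i, j, m, n) in memory


# Dense tables store every value; the combined memory stores only the ones.
dense = {(1, 1, 1, 1): 1, (1, 1, 1, 2): 0, (2, 1, 5, 3): 1, (2, 1, 5, 4): 0}
memory = compress(dense)
```

Here the dense form holds four entries while the combined memory holds two, and a coordinate that is absent from the memory is read back as “0”.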

In the above-described embodiment, when the response element arrays 201, 202, 203 and 204 are constructed so as to have a response time phase of 90°, the detection elements of the motion direction detecting unit 5 are operated in accordance with the response outputs of the response element arrays 201, 202, 203 and 204 to obtain a motion direction angle from 0° to 360° as shown in FIG. 5A. This has been mathematically proved by the formulae (1) to (7).

Also, in the above-described embodiment, since one two-dimensional pixel data area is divided into partly-superposed local areas A₁, A₂, . . . , A_(m), . . . as shown in FIG. 3, the response output of the response elements of one local area includes a part of the response output of the response element of its adjacent local areas, which is helpful in enhancing the detection accuracy of motion directions.

In the prior art adaptive motion direction detecting apparatus where two frame memories are used for storing image data at different timings, the frame memories require a capacity of 1000×1000×2 bits (=2,000,000 bits). On the other hand, the above-described embodiment requires:

1000×1000×1 bits (image input unit);

100×100×4 bits (response elements);

100×100×1 bits (detection elements); and

100×100×20 bits (memory of FIG. 8),

i.e., only 1,250,000 bits in total. Thus, a remarkable reduction of memories can be expected.
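The memory comparison above can be checked with a few lines of arithmetic. The variable names are illustrative only; the figures are those stated in this description.

```python
# Prior art: two frame memories of 1000 x 1000 pixels at 1 bit each.
prior_art_bits = 1000 * 1000 * 2

# Embodiment: capacities listed in the description.
embodiment_bits = {
    "image input unit": 1000 * 1000 * 1,
    "response elements": 100 * 100 * 4,
    "detection elements": 100 * 100 * 1,
    "memory of FIG. 8": 100 * 100 * 20,
}

embodiment_total = sum(embodiment_bits.values())
saving = prior_art_bits - embodiment_total
print(embodiment_total, saving)  # prints: 1250000 750000
```

The saving of 750,000 bits corresponds to 37.5% of the 2,000,000 bits required by the two frame memories.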

Also, since it is unnecessary to compare two kinds of image data to detect a motion direction of an object (image), the efficiency of detection of a motion direction can be enhanced.

As explained hereinabove, according to the present invention, a motion direction of an object (image) can be detected regardless of whether the contour of the object is bright or dark. Also, the resource efficiency of accurately detecting a motion direction can be enhanced. Further, the detected motion direction of an object is statistically reflected by the motion of an input image of the object. Additionally, the resource of memories can be saved.

1. An adaptive motion direction detecting apparatus comprising: an image input unit for inputting two-dimensional pixel data of an object; a response output unit including a plurality of response element arrays each having different time phases, each of said response element arrays including a plurality of response elements each generating a response output for one of a plurality of local areas partly superposed thereon, said two-dimensional pixel data being divided into said local areas; a correlation function calculating unit for calculating spatial and time correlation functions between the response outputs of said response elements; a response output selecting unit for selecting response outputs of said response output unit for each of said local areas in accordance with said spatial and time correlation functions; and a motion direction detecting unit including a plurality of detection elements each corresponding to one of said response elements of each of said response element arrays, each of said detection elements detecting a motion direction of said object at one of said local areas in accordance with selected response outputs for said one of said local areas.
 2. The apparatus as set forth in claim 1, wherein said time phases are 0°, 90°, 180° and 270°.
 3. The apparatus as set forth in claim 1, wherein said response input unit divides said two-dimensional pixel data into said local areas by performing a Gaussian function upon said two-dimensional pixel data.
 4. An adaptive motion direction detecting method comprising the steps of: inputting two-dimensional pixel data of an object; generating a response output for one of a plurality of local areas partly superposed thereon, said two-dimensional pixel data being divided into said local areas; calculating spatial and time correlation functions between the response outputs; selecting response outputs for each of said local areas in accordance with said spatial and time correlation functions; and detecting a motion direction of said object at one of said local areas in accordance with selected response outputs for said one of said local areas.
 5. The method as set forth in claim 4, wherein said response output has time phases of 0°, 90°, 180° and 270°.
 6. The method as set forth in claim 4, wherein said two-dimensional pixel data are divided into said local areas by performing a Gaussian function upon said two-dimensional pixel data. 